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GENERALIZED  NEWTON  METHOD  FOR  ENERGY 


FORMULATIONS  IN  IMAGE  PROGESSING 

LEAH  BAR  AND  GUILLERMO  SAPIRO* 

Abstract.  Many  problems  in  image  processing  are  addressed  via  the  minimization  of  a  cost 
functional.  The  most  prominent  optimization  technique  is  the  gradient  descent,  often  used  due  to  its 
simplicity  and  applicability  where  other  techniques,  e.g.,  those  coming  from  discrete  optimization, 
can  not  be  used.  Yet,  gradient  descent  suffers  from  a  slow  convergence,  and  often  to  just  local  minima 
which  highly  depends  on  the  condition  number  of  the  functional  Hessian.  Newton-type  methods,  on 
the  other  hand,  are  known  to  have  a  rapid,  quadratic,  convergence.  In  its  classical  form,  the  Newton 
method  relies  on  the  L^-type  norm  to  define  the  descent  direction.  In  this  paper,  we  generalize 
and  reformulate  this  very  important  optimization  method  by  introducing  a  novel  Newton  method 
based  on  more  general  norms.  This  generalization  opens  up  new  possibilities  in  the  extraction  of  the 
Newton  step,  including  benefits  such  as  mathematical  stability  and  the  incorporation  of  smoothness 
constraints.  We  first  present  the  derivation  of  the  modified  Newton  step  in  the  calculus  of  variation 
framework  needed  for  image  processing.  Then,  we  demonstrate  the  method  with  two  common  objec¬ 
tive  functionals:  variational  image  deblurring  and  geodesic  active  contours  for  image  segmentation. 
We  show  that  in  addition  to  the  fast  convergence,  norms  adapted  to  the  problem  at  hand  yield 
different  and  superior  results. 

Key  words.  Newton  method,  variational  methods,  trust-region,  generalized  inner  product, 
active  contours,  deblurring. 

AMS  subject  classifications.  35A15,  65K10,  90C53 

1.  Introduction.  Optimization  of  a  cost  functional  is  a  fundamental  task  in 
variational  image  analysis,  where  the  most  widely  used  optimization  techniques  are 
based  on  gradient  flows.  In  the  popular  iterative  gradient  descent  method,  the  de- 
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scent  step  or  search  direction  is  given  by  the  negative  gradient,  and  the  functional 
is  progressively  (iteratively)  minimized  advancing  in  this  direction.  The  definition  of 
the  gradient  relies  on  an  inner  product  structure,  and  in  most  studies  the  L^-type 
inner  product  is  implicitly  assumed. 

Recently,  generalized  gradient  descent  approaches  were  introduced  in  image  anal¬ 
ysis  by  defining  different  inner  product  types.  Sundaramoorthi  et  al.,  [19,20]  (see 
also  [21]),  formulated  the  generic  geometric  active  contour  model  by  redefining  the 
gradients  with  Sobolev- type  inner  products.  As  a  result,  improvement  in  region-based 
and  edge-based  segmentation  was  accomplished,  and  important  ill-posed  flows  became 
well-possed.^  Charpiat  et  al.,  [8],  derived  the  general  gradient  descent  process  asso¬ 
ciated  with  a  symmetric  positive  linear  operator  which  defines  a  new  inner  product. 
They  demonstrated  that  the  choice  of  the  inner  product  can  be  considered  as  a  prior 
on  the  deformation  fields  in  shape  warping  and  tracking  applications.  Related  to  this 
work,  Eckstein  et  al.,  [10],  showed  the  importance  of  the  norm  selection  in  the  context 
of  shape  matching. 

The  major  weakness  of  the  gradient  descent  method  is  that  despite  its  simplicity, 
the  convergence  rate  can  be  very  poor,  many  iterations  are  needed  to  achieve  a  (local) 
minima.  On  the  other  hand,  it  is  well  known  in  optimization  theory  that  Newton 
methods  are  much  faster,  with  a  quadratic  convergence  [4,5]. 

Let  us  illustrate  how  the  Newton  method  computes  the  descent  direction  for 
the  case  of  functions,  we  will  later  work  on  functionals  of  the  type  used  in  image 
processing.  Let  f{x)  :  ^  M  be  a  real-valued  function.  The  second  order  Taylor 


^It  is  important  to  note  that  the  change  of  inner  product  does  not  change  the  energy  and/or 
its  local  and  global  minima.  It  produces  a  different  minimization  path,  which  might  then  end  at  a 
different  stationary  point  of  the  energy  (or  its  numerical  approximation). 
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approximation  /(x  +  d)  of  /  at  a;  is  given  by 


f{x  +  d)  «  f{x)  +  \7f'^{x)d  +  Hessian/(a;)  d, 

where  the  second  and  third  terms  on  the  righthand  side  are  the  first  and  second  direc¬ 
tional  derivatives  of  /  at  a;  in  the  direction  d  G  This  expression  approximates 

the  change  in  /  for  a  small  step  d.  Minimizing  this  quadratic  approximation  with 
respect  to  d,  yields  the  Newton  step  d  as  the  solution  to  the  equation 

Hessian/(a:)d  =  —\7f{x). 

The  solution  will  attain  a  minimum  if  the  Hessian  is  positive  definite.  In  the  case  that 
/  is  (locally)  nearly  quadratic,  f{x  +  d)  is  a  very  good  estimate  for  the  minimizer  of 
/.  The  damped  Newton  method  refers  to  the  case  where  the  minimizer  is  iteratively 
updated  as  f{x  +  Atd),  where  At  is  selected  via  a  line  search  process.  In  the  pure 
Newton  method,  the  time  step  is  fixed  to  At  =  I  [4, 5] . 

The  main  disadvantages  of  the  Newton  method  are  the  cost  of  the  calculation  of 
the  Hessian  (while  computing  the  descent  direction  d),  and  the  computation  of  the 
Newton  step,  which  require  solving  a  set  of  linear  equations.  Still,  Newton  methods 
are  overall  faster  than  gradient  descent.  The  method  may  be  attracted  to  a  local 
maximum  or  saddle  point  in  regions  where  the  Hessian  is  not  positive  definite.  In  the 
case  of  indefinite  Hessian,  a  downhill  search  direction  can  be  obtained  by  solving 

[XI  +  Hessian/(a;)]  d  =  —  V/(a;). 

^While  standard  gradient  descent  is  based  only  on  first  order  derivatives,  we  already  see  from 
this  expression  that  second  variations  are  part  of  the  Newton  method. 
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With  a  sufficiently  large  value  of  A  G  M"*",  [\I  +  Hessian/(a;)]  is  positive  definite.  It 
can  be  shown  that  the  solution  to  this  equation  is  as  the  solution  of 

f{x  +  d)  =  f{x)  +  \7f'^{x)d  +  ^d^  Hessian/(a;)  d,  s.t.  ||(i|j  <  A, 

where  the  optimization  is  now  subject  to  an  upper  bound  of  the  size  of  d.  Minimiza¬ 
tion  with  the  above  equation  is  known  as  the  trust-region  method,  [9].  With  this 
modification,  the  solution  is  proved  to  converge,  the  cost  functional  is  nonincreas¬ 
ing  at  every  iteration.  Furthermore,  the  computational  cost  of  the  calculation  of  the 
descent  direction  d  can  be  reduced  by  the  trust-region  constraint  [9]. 

In  the  calculus  of  variation  framework,  where  functionals  take  the  place  of  func¬ 
tions,  the  derivative  is  replaced  by  the  first  Gateaux  variation  and  the  Hessian  is 
replaced  by  the  second  Gateaux  variation.  Few  variational  image  processing  studies 
had  used  the  Newton  method  for  optimization.  Hintermiiller  and  Ring,  [11],  solved 
the  segmentation  of  grey  scale  images  by  the  minimization  of  the  Mumford-Shah 
functional,  [13],  via  Newton-type  methods.  Zhang  and  Hancock,  [23],  developed  an 
edge-preserving  filter  for  smoothing  images  whose  features  reside  on  a  curved  mani¬ 
fold,  and  optimize  it  via  Newton-type  methods.  Both  works  use  standard  norms. 
Part  of  the  contributions  of  the  present  paper  is  to  develop  Newton-type  method  with 
more  general  norms.  This  is  inspired  in  part  by  the  above  mentioned  extensions  of 
gradient  descent  methods  beyond  norms. 

In  the  work  of  Absil  et  al.  [1],  the  authors  proposed  and  analyzed  the  trust-region 
Newton  method  on  Riemannian  manifolds.  The  Riemannian  metric  induces  a  norm 
on  the  tangent  space  which  is  used  explicitly  in  the  Newton  method.  The  algorithm  is 
illustrated  on  problems  from  numerical  linear  algebra  with  well  defined  Riemannian 
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structure.  While  this  study  is  close  to  the  work  presented  in  this  paper,  here  we  focus 
on  (image)  functionals  in  the  variational  framework  in  abstract  Euclidean  space.  In 
addition,  we  show  that  the  norm  or  inner  product  can  be  adaptively  changed  during 
the  iterative  minimization  process,  yielding  modified  results. 

In  this  paper  we  derive  a  generalized  Newton  method  based  on  a  general  norm 
in  the  calculus  of  variations  framework.  We  thereby  enjoy  the  advantages  of  Newton 
methods,  such  as  fast  convergence,  while  providing  the  flexibility  to  adapt  the  metric 
to  the  problem  at  hand.  We  begin  by  reviewing  the  necessary  conditions  for  functional 
minimum  by  the  theory  of  the  second  variation,  Section  2.  We  proceed  by  presenting 
the  quadratic  approximation  of  the  functional,  which  is  a  critical  step  in  the  Newton 
method,  and  continue  with  the  mathematical  derivation  of  the  generalized  Newton 
step.  Section  3.  Numerical  simulations  demonstrate  the  performance  of  the  algorithm 
for  image  segmentation  (Section  4)  and  deblurring  (Section  5).  For  each  one  of  these 
examples,  we  propose  new  metrics,  including  adaptive  ones.  We  show  that  although 
the  classical  Newton  method  is  computationally  very  efficient,  the  results  can  be 
quite  poor.  By  using  different  norms  in  the  proposed  generalized  method,  state-of- 
the-art  results  are  obtained  with  the  additional  advantage  of  high  convergence  rate. 
Furthermore,  given  a  highly  noisy  image  for  example,  the  segmentation  procedure 
tends  to  fail  with  the  classical  gradient  descent  method  as  pointed  out  in  [8, 19]. 
Choosing  an  appropriate  norm  in  the  proposed  framework  alleviates  the  problem 
sensitivity  and  yields  improved  results  at  faster  convergence  rates.  Theoretical  results 
on  the  proposed  segmentation  and  deblurring  formulations  obtained  by  using  the 
proposed  new  metrics  in  the  Newton  methods  are  presented  as  well  in  sections  4 
and  5.  Finally,  Section  6  concludes  the  work. 
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2.  Necessary  Conditions  for  Minimum.  Many  problems  in  image  processing 
are  solved  via  the  minimization  of  a  cost  functional.  In  this  section  we  review  some 
necessary  conditions  for  relative  minimum  values  of  functions  of  real  variables.  The 
first  necessary  condition  is  known  as  the  Euler-Lagrange  equation(s).  However,  this 
condition  is  satisfied  for  maximum  and  saddle  points  as  well,  and  therefore  additional 
conditions  are  necessary  for  minimum  values. 

We  focus  on  both  the  Legendre  and  Jacohi  conditions  that  are  derived  from  the 
theory  of  the  second  variation  [15].  The  Legendre  condition  is  satisfied  whenever  a 
corresponding  sub-Hessian^  of  the  functional  is  positive  definite."^  Later  in  the  paper, 
we  will  optimize  functionals  which  do  not  necessarily  satisfy  this  condition.  We  will 
therefore  present  the  generalized  Newton  method  with  trust-region,  where  the  sub- 
Hessian  can  be  arbitrary,  while  satisfying  this  necessary  condition. 

Consider  the  minimization  problem  for  the  following  functional 

T{f)  :=  [  I{x,f{x),\/f{x))dx, 

Jq 

where  we  assume  that  I  G  (7^(3?),  3?  being  a  domain  in  (x, /(x),  V/)  space  for  /  G 
(7^(11),  and  {x,  f{x),V f{x))  G  3?  for  all  x  G  (fl  being  a  region  in  Let  ip 

denote  the  functional  variation  in  a  domain  LI  such  that  it  is  zero  at  the  boundary 
(see  below) .  The  first  necessary  condition  for  a  relative  extremum  of  the  functional  is 


■=  Ye  •^(/  +  £^) 


=  0, 


ViA  G  9, 


e=0 


(2.1) 


^The  sub-Hessian  is  a  sub-matrix  of  the  functional  Hessian. 

While  in  real-valued  functions,  f'{xo)  =  0  and  f"(xo)  >  0  are  sufficient  conditions  for  a  relative 
minimum  at  x  =  xq,  additional  conditions  have  to  be  satisfied  in  the  case  of  functionals  [15]. 

^We  now  consider  /  a  scalar  function,  while  extensions  to  vector-valued  functions  are  possible  as 
well. 
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where  £  G  M'*',  9  =  {^  |  V'  G  V'lan  =  0},  and  p^(ri)  is  the  space  C^(ri)  in 

which  the  norm  is  defined  as 


blip  :=  max|5(x)|  +m^|V5(x)|. 


The  expression  ip{tp)  is  denoted  as  the  first  Gateaux  variation,  and  Equation  (2.1) 
leads  to  the  basic  Euler-Lagrange  equations  (which  form  the  core  of  the  gradient 
descent  minimization  approach).  As  we  mentioned  in  the  introduction,  this  is  only 
the  first  necessary  condition  for  a  relative  minimum.  The  additional  necessary  (but 
not  sufficient)  condition  of  the  second  variation  is  given  by  [15]: 

>  0. 

£=0 


Let  Xi  G  {i  =  1,2,  •••fV)  be  the  variables  set  of  the  the  function  /,  where  N  is 
the  dimension  of  the  domain.  In  addition,  let  denote  the  partial  derivative  of 
the  function  /  with  respect  to  the  variable  Xi,  and  If  the  partial  derivative  of  the 
functional  I  with  respect  to  /.  Hence,  the  second  variation  takes  the  form 

'  N  N  \ 

ifti.i’i’xi  I  (2-2) 

i,j=i  i=i  j 

Assume  for  simplicity  that  N  =  l,x  =  Xi,  and  let  R{x)  :=  If^f^,  Q{x)  :=  Iff^,  and 
P{x)  :=  Iff.  In  addition,  assume  a  regular  extrema,  where  R{x)  yf  0,  Vx  G  fl.  By 
the  Legendre’s  transformation  of  the  second  variation,  (2.2)  can  be  rewritten  as 


Q(x)  +  u(x) 
R(x) 


f>{x) 


dx. 


Vug  Ci(H). 
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Jacobi  proposed  to  introduce,  instead  of  u(x),  a  new  function  r]{x)  G  by  means 


of  the  substitution 


u(x)  =  -R(x)  -  Q(x), 

r]{x) 

which  leads  to  the  Jacobi  equation 

4-  (RVx)  +  (Qx  -  P)v  =  0.  (2.3) 

ax 

Lemma  1.  If  the  Jacobi  equation  (2.3)  has  a  solution  ri{x)  yf  0  for  all  x  G  fl,  and 
if  R{x)  >  0  for  all  X  G  (strengthened  Legendre  condition),  then  (p‘^(gp)  is  positive 
definite. 

Proof.  Lemma  7.4.1  in  [15].  □ 

In  the  functionals  we  consider  in  sections  4  and  5,  the  image  dimension  N  is  greater 
than  1,  and  therefore  the  (strengthened)  Legendre  condition,  {R{x)  >  0)  R{x)  >  0, 
is  generalized  to  the  condition  that  the  matrix  R  :=  is  positive  definite.  We 

will  show  that  despite  the  fact  that  the  Legendre  condition  is  not  satisfied  in  our 
examples,  the  functionals  can  still  be  optimized  by  the  suggested  generalized  Newton 
with  trust-region  method. 

3.  Generalized  Newton  Method  Derivation.  In  this  section,  we  derive  the 
generalized  Newton  optimization  method  in  a  variational  framework,  with  general 
metrics  and  additional  trust-region  constraints.  The  contribution  of  the  method,  on 
top  of  the  known  computational  efficiency  of  Newton-type  methods,  is  mainly  achieved 
by  the  flexible  formulation  of  the  inner  product.  Different  selections  of  inner-products, 
adapted  to  the  application  at  hand,  yield  different  and  improved  solutions  to  the  mini- 


mization  problem,  by  progressing  via  a  different  minimization  path.  The  incorporated 
trust-region  stabilizes  the  solutions  in  the  case  that  the  Legendre  condition  detailed 
above  is  not  satisfied. 

The  second  order  Taylor  expansion  of  the  common  cost  functional 

J^if)  :=  [  I{x,f{x),\/f{x))dx, 

JQ 

motivates  the  Newton’s  method.  Let  /  be  the  estimation  of  the  (local)  minimizer 
of  this  functional.  The  quadratic  approximation  to  the  variation  +  ip)  with  the 
trust-region  constraint  is  given  by 


Q(V>)  :=  T{J)+  <  VfTU)  I  V;  >  ^  I  V’  >,  s.t.  ||^||  <  A,  (3.1) 


where  p)  €  A  denotes  the  trust-region  radius,  and 

Tfy  :=  Hessian  J^(/).  (3.2) 

The  notation  <  -  j-  >  stands  for  the  inner  product  such  that 

hWl^a)  =  <9\9>- 

In  the  sequel,  we  will  alternately  use  the  Hessian  and  second  variation  notions  since 

=<  I  V”  >  ■ 

The  minimizer  of  Q{tp)  in  Equation  (3.1)  gives  the  Newton  step  direction  which 
decreases  the  functional  value  iF{f  pj)  towards  the  relative  minimum. 
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The  first  variation  is  given  by 


<  V/^(/)  I  ^  >= 


N 


dx, 


while  the  second  variation  is  given  by  Equation  (2.2).  In  the  important  case  for  image 
processing  of  =  2  {xi  =  x  and  X2  =  y),  the  integrand  of  Equation  (2.2)  can  be 
expressed  in  quadratic  form, 


'tp  i’x  4>y 


a  d  e 
d  b  f 

\‘  >  ‘J 


i’x 

y  i’y  y 


(3.3) 


where  a  :=  Iff,  b  :=  If^f^,  c  :=  If^f^,  d  :=  Iff^,  e  :=  ///^,  and  /  :=  If^f^. 


Let  us  now  extend  the  above  formulation  to  more  general  metrics.  Consider  an 
abstract  infinite  dimensional  Euclidean  space  -  a  vector  space  endowed  with  an  inner 
product  such  that,  e.g.,  [8,19], 


<  u  I  u  >c=<  Cu\  V  >, 

where  C  :  ^  is  a  symmetric  positive  definite  linear  operator  with  the  domain 

and  range  equal  to  the  space  [8] . 

In  the  proposed  generalized  Newton  method,  the  critical  second  order  Taylor 
expansion  is  formulated  in  this  abstract  Euclidean  space  with  a  general  inner  product, 
subject  to  the  corresponding  trust-region  constraint, 

mi-ip)  :=  T{f)+ <V fT{f)  \  Ip  >c \ -ip  >c  s.t.  [jV’IU  <  A,  (3-4) 
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where 


llV'lli  =<  /iV’  I  ^  >  ■ 

The  new  ,  metric-dependent,  Newton  step  ipN  now  the  minimizer  of  this  truncated 
Taylor  expansion.  The  minimization  of  m^ip)  with  respect  to  ■0  is  carried  out  using 
the  first  Gateaux  derivative, 


9  r  , 

—  mi'ip  +  srj) 
oe 


=  0,  ?7  G  9. 


£=0 


By  the  linearity  of  the  operator  C, 


i(0  -h  er])  =T{f)  +  J  ^  (^/•^(/))  i^  +  sv)dx+^  J  C  {Uj  (0  -h  e??))  (0  -f  erj) 

=-^(/)  +  j  f))  {'4’  + sv)dx+]^  J  +eC(Hjr)^  {-ip  +  er]) 


dx. 


Hence, 


de 


m{ip  +  srj) 


e=0 


J  ^  (^/•^(/))  'ndx+^  J  L  (Hj:  0)  r?  -f  £  (Uf  77)  0 


dx. 

(3.5) 


The  integral  equation  (3.5),  therefore,  takes  the  form 


<  V/J^(/)  I  ?7  >c  <  'Hfip  I  ?7  >c  +1  <  0  I  Hfi]  >c=  0,  s.t.  11011c  <  A. 


(3.6) 
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Using  the  quadratic  form  (3.3)  yields 


<  I  77  >£  =  <  C  I  11  > 

=  /  £  {ail)  +  dil)x  +  eil)y)  77  +  £  {dip  +  bipx  +  fipy)  Vx  (3-7) 

Jn 

+  £  {eip  +  fipx  +  cipy)  r]y  dx. 

In  the  same  fashion, 

<  Ip  I  Hjri  >c=  <  C{ip)  I  Kjip  > 

=  J  {ar]  +  drjx  +  er]y)  £{ip)  +  {drj  +  br]x  +  fVy)  ^{i^x)  (3-8) 

+  (e??  +  fVx  +  crjy)  C{ipy)  dx. 

By  substituting  (3.7)  and  (3.8)  into  (3.6),  and  using  integration  by  parts  and  the 
fundamental  lemma  of  calculus  of  variations,  we  end  up  with  the  following  partial 
differential  equation  with  respect  to  ip,  where  the  solution  to  this  equation  is  the 
desired  Newton  step  ipN, 

C{aip  +  dipx  +  eipy)  -  dx  [C{dip  +  bipx  +  fipy)]  -  dy  [C{eip  +  fipx  +  cipy)] 

+  aC{ip)  +  dC{ipx)  +  eC{ipy)  -  dx  [dC{ip)  +  bC{ipx)  +  fC{ipy)]  (3-9) 

-  dy  [eC{iP)  +  fC{iPx)  +  cC{iPy)]  =  -£  {If  -  dx{lfj  -  dy{If,)))  . 

3.1.  Numerical  Details.  Having  derived  the  basic  formulation  of  the  general¬ 
ized  Newton  method,  and  having  derived  the  corresponding  equation  to  compute  the 
Newton  step  direction  (Equation  (3.9)),  we  proceed  now  with  the  numerical  details 
of  the  algorithm. 

The  basic  Newton  method  is  an  iterative  process  (steepest  descent  algorithm), 
where  at  every  iteration  n,  the  Newton  step  is  added  to  the  current  minimum  point 
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estimation.  In  the  case  of  the  damped  Newton  method,  a  step  size  At  multiplies  the 
step  direction  dn  (below,  when  we  return  to  the  variational  case,  the  step  direction  is 

obtained  from  solving  Equation  (3.9)).  For  the  simple  case  of  functions,  we  have  [5]: 

Damped  Newton  Algorithm 

1 .  Choose  a  starting  point  xq  G  domain(/) 

2.  Repeat  n=0,l,2,... 

3.  Compute  Newton  step  =  — [Hessian/(a;„)]“^V/(a:„) 

4.  Choose  At  by  stauidard  backtracking  line  search 
5  .  Update  Xn+i  =  a;„  +  At(t„ 

6.  Until  II V/(a:„+i)||  is  sufficiently  small. 


The  iterative  algorithm  that  we  use  in  this  study  is  an  extended  version  of  this 
dumped  Newton  method.  It  is  based  on  the  work  of  Steinhaug  [18],  where  the  New¬ 
ton  method  is  solved  subject  to  the  trust-region  constraint.  The  trust-region  A  is 
determined  at  every  iteration  (see  below),  and  the  calculation  of  the  Newton  step 
dn  is  performed  by  the  truncated  conjugate  gradient  algorithm  with  trust-region. 
Whenever  we  encounter  a  negative  Hessian,  then  we  move  to  the  boundary  of  the 
trust-region.  Steinhaug  proved  that  with  this  approach,  the  quadratic  sequence  of 
the  Taylor  expansion  of  real- valued  functions, 

Qn  ■=  f{xn)+  <  V/(a;„)  \dn>  +^<  Hessian/(a;„)d„  |  dn  >, 
is  strictly  decreasing,  and 


liminf  |jV/(x„)||  =  0. 

n—*oo 
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In  this  paper  we  extend  the  algorithm  of  [18]  to  the  variational  framework,  with 
the  proposed  generalized  inner  product  (we  use  standard  finite  difference  schemes 
to  evaluate  numerical  derivatives).  In  the  variational  setting,  the  Newton  step  at 
iteration  n  is  denoted  by  (and  is  obtained  by  solving  Equation  (3.9)).  In  the 
following  variational  trust-region  algorithm,  we  calculate  the  Newton  step  and  update 
the  trust  region  A„  at  every  iteration.  The  computation  of  ipn  is  performed  using 
the  truncated  conjugate  algorithm  with  trust  region  A„. 


Variational  Trust-Region  Algorithm 

1 .  Initialize  /g,  Ao  =  A»l,  0<£<Cl,  0<a2<Q;i<l,  72<l<7i- 

2.  Repeat  n=0,1.2,... 

3.  Solve  (3.9)  by  the  truncated  conjugate  gradient  algorithm  with 
trust-region  A„ ,  obtaining  the  direction  . 

4.  Choose  At  by  steuidard  backtracking  line  search  [5] . 

5 .  Set 

HU  +  AtV^„)  -  Hfn) 

m(At^/>„) 

6.  If  p„>a2,  then  /„+i  At^/’n ,  otherwise  /„+i  . 

7.  If  pn>ai,  then  A„+i  :=  min(7i  ||^/>„||£,  A) ,  otherwise  A„+i  :=  72||V'n||£  • 

8.  Until  (iir+i-rik  <e||r||c). 


Step  3  of  the  Variational  Trust-Region  Algorithm  is  the  calculation  of 
the  Newton  step  ipn  as  the  solution  of  Equation  (3.9)  by  the  conjugate  gradient 
method.  The  following  Truncated  Conjugate  Gradient  Algorithm  with 
Trust- Region  is  a  detailed  description  of  this  stage.  In  the  cases  where  the  Hessian 
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is  not  positive  definite  (line  2  of  Truncated  Conjugate  Gradient  Algorithm 
with  Trust-Region  below),  or  the  norm  of  f/'  exceeds  the  trust  region  A„  (line  5  of 
the  algorithm),  we  recalculate  tpn  such  that  HV’nllc  =  (project  to  the  boundary 
of  the  trust  region) .  The  Hessian  at  /„  is  denoted  as  Hf^  and  it  has  the  structure  of 
the  matrix  in  Equation  (3.3). 


Truncated  Conjugate  Gradient  Algorithm  with  Trust-Region 

1.  Initialize  tpo  :=  0,  ro  :=  -£  (V/J^(/„)) ,  Vq  :=  -tq,  i  :=  0,  ^  <  1, 
MaxLoops>l . 

2 .  if  <  Vi\Hf^Vi  >c  <  0  goto  11 . 

3.  ai  :=<n\ri>c  /  <Vi\'Hf^Vi>c- 

4.  V'i+i  :=  V’i  + 

5.  if  llf/'i+ilk  >  A„  goto  11. 

6 .  n+i  :=  Ti  -  aiTif^  Vi . 

7.  if  |jri+i||£  /  ||ro||c  <  ^  or  t  >  MaxLoops  set  ipn  = 'fpi+i  and  terminate. 

8.  A  :=<  A+i|a+i  >£  /  <  aIa  >£  • 

9.  v^+i  :=  ri+i  +  PiVi . 

10  .  Set  i  :=  i  +  I  auid  goto  2 . 

11.  Compute  T  >  0  such  that  ipn  =  AT  satisfies  H'i/'nllc  =  A„  and 

terminate . 


While  estimating  the  minimizers  of  functionals  in  the  variational  setting,  as  was 
explained  in  Section  2,  there  are  several  necessary  conditions  to  attain  a  relative  mini¬ 
mum.  Whenever  the  Legendre  or  Jacobi  conditions  are  not  satisfied  (Lemma  1),  there 
is  no  guarantee  for  the  second  variation  to  be  positive  definite,  and  we  therefore  do 
not  necessarily  converge  towards  a  minimum  point.  The  suggested  method  alleviates 
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this  difficulty  by  using  the  trust-region  constraint,  such  that  the  solution  is  projected 
to  the  boundary  of  the  trust-region  in  the  cases  where  the  second  variation  is  not  pos¬ 
itive  definite.  This  method  has  additional  computational  efficiency  advantages,  while 
calculating  the  Newton  step  i/'n  in  the  conjugate  gradient  method,  the  algorithm  ter¬ 
minates  whenever  the  norm  of  'ipn  exceeds  the  trust  region.  In  the  next  two  sections 
we  demonstrate  the  proposed  method  with  two  different  cost  functionals  relevant  in 
image  processing.  We  show  that  in  addition  to  the  above  advantages  of  the  proposed 
algorithm,  the  selection  of  the  operator  C  plays  a  significant  role  in  the  optimization 
process. 

4.  Geodesic  Active  Contours.  As  a  first  example,  we  address  the  classical 
geodesic  active  contours  framework  for  image  segmentation.  In  this  framework,  a 
contour  is  evolved,  via  the  minimization  of  a  geometric  energy,  toward  the  boundaries 
of  the  objects  of  interest  [6,16,22].  This  is  the  main  problem  that  was  addressed  in  [19] 
via  a  modified  gradient  descent  flow  with  Sobolev  norm.  Here  we  show  that  following 
the  Newton  framework  developed  in  Section  3,  further  significant  improvements  are 
obtained,  both  at  the  computational  efficiency  and  quality  of  results  levels. 

Let  u{x)  :  ^  M’*'  denotes  the  observed  image  where  we  are  interested  in 

detecting  objects.  The  deforming  contour  is  implicitly  represented  by  the  zero  level 
set  of  a  function  C  ^  M  [14].  As  an  example,  we  define  the  following 

energy  (see  [7]):® 


:=  [  Ai(u  -  ci)^i7((())  -I-  A2(w  -  02)^  (1  -  H{4))) 
Ja 

+  5(|Vm1)  6{<P)\\/{<P)\dx, 


(4.1) 


®Here  we  use  the  standard  level-set  notation  if)  instead  of  /  in  the  generic  functional  structure 
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where 


5(|Vu|) 


r  I 

1  +  |Vu|2/A  ' 


fXjV,  \,  Xi,  \2  G  M+,  ci,C2  G  M,  H{-)  is  the  heaviside  function,  and  S{z)  =  j^H{z) 
is  the  Dirac  delta  function  in  the  sense  of  distributions.  The  goal  is  to  minimize  Ti 
with  respect  to  (j),  ci  (the  average  gray  value  inside  the  object  of  interest),  and  C2 
(the  average  grey  value  of  the  background),  obtaining  the  desired  contour  (object 
boundary)  by  the  zero  level  set  of  (f). 


Following  Chan  and  Vese,  [7],  the  heaviside  function  is  approximated  as  (0  <  e  <C 


1) 


He{x) 


1 

2 


/  2 
1  H —  arctan 

V 


and 


^e(a;) 


1  e 

TT  e2  +  x2 


The  functional  (4.1)  is  alternately  optimized  between  ci,  C2  and  (j).  The  scalars  ci  and 
C2  are  easily  computed  by 


J^uH{(j}n)dx  _  J^U{1  -  H{<j3n))dx 

J^H{(j)n)dx’  -  H{(j)n))dx  ' 


where  </>„  denotes  the  level  set  function  at  iteration  n.  The  first  variation  with  respect 
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to  (j)  is 


<  \i^>=  SM 


Xl{u  -  Ci)^  -  A2(u  -  02)^  -  V 


1±\ 


Ip  dx. 


Thus,  the  gradient  at  iteration  n  is 


Cl)" 


A2(u 


C2)"  -  V  • 


(4.2) 


The  Hessian  which  is  the  second  variation  matrix  of  the  quadratic  form  (3.3), 


is  given  by 


= 


^"(</>)  [•^i(/  -  Ci)^  -  A2(/  -  C2f  +  9\V(p\] 

0 


aKW’P^  aKW'Py 


|V0| 

gK(<f>)<l>v 

\^<P\ 


+ 


^  0  0  0  ^ 


0  Rii  Ri2 

0  R21  R22 


|v<>| 

0 

0 


(4.3) 


where  the  matrix  R  :=  Rij  {i,j  =  1,2)  takes  the  form 


R  = 


|V(()|3/2 


9Se{(p)(pl  -9Se{(p)(px<Py 

-9^e{<P)(px(py  9^e{(p)(pl 


(4.4) 


The  matrix  R  is  the  vectorial  version  of  the  function  R(x)  as  was  previously  addressed 
in  the  Legendre  condition  (Section  2,  Lemma  1).  This  matrix  is  clearly  indefinite, 
and  therefore  the  Legendre  condition  is  not  satisfied.  Thus,  by  Lemma  1,  we  can  not 
guarantee  that  the  second  variation  is  positive  definite.  Using  the  proposed  algorithm. 
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whenever  a  negative  Hessian  (second  variation)  is  encountered,  the  solution  is  moved 
to  the  boundary  of  the  trust-region  and  the  minimization  becomes  well-possed. 


The  above  first  and  second  variations  of  the  geodesic  active  contour  functional  are 
used  in  the  calculation  of  the  Newton  step  in  the  suggested  variational  trust-region 
algorithm  as  introduced  in  Section  3.  The  Newton  step  ipn,  is  determined  by  means 
of  the  Truncated  Conjugate  Gradient  algorithm  with  Trust-Region,  where 
the  gradient  ((()„)  is  given  by  Equation  (4.2)  and  the  Hessian  is  calculated 
according  to  equations  (4.3)  and  (4.4)  a,t  (j)  =  (pn- 

Due  to  intrinsic  noise  in  real  data,  using  the  standard  inner  product  in  these 
first  and  second  variations  results  in  a  noisy  evolving  level  set  function,  both  in  the 
case  of  the  classical  gradient  descent  method  and  the  case  of  the  classical  Newton 
method.  This  is  due  to  the  high  (noisy)  gradients  caused  by  the  noise,  and  the  fact 
that  the  geodesic  (geometric)  active  contour  functional  is  minimized  along  prominent 
gradients.  A  much  more  promising  result  was  obtained  using  the  Sobolev  gradient 
descent  flow  [19],  and  this  is  further  improved  with  the  here  proposed  generalized 
Newton  method. 

For  the  here  introduced  generalized  Newton  method,  the  new  inner  product  is 
designed  with  the  smoothing  operator  which  is  a  convolution  with  a  Gaussian 
kernel  ha  of  variance  a, 


<  u  I  u  >Cs  '=<  ha  u  \  V  >  . 

Theorem  1.  The  operator  Cg  defined  as  the  convolution  with  a  Gaussian  of 
width  a,  LgU  :=  ha  *  u,  is  self-adjoint  and  positive  definite. 

Proof.  See  appendix  A.  □ 
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This  operation  smoothes  the  level  set  function  (j)  in  the  generalized  Newton 
method  and  reduces  high  perturbations.  The  obtained  results  are  improved  even 
when  compared  to  the  Sobolev  gradient  descent  method,  see  below,  with  the  addi¬ 
tional  advantage  of  computational  efficiency. 

In  the  following  examples,  the  convergence  criteria  was  set  to  /  — 

H{(j>n+i\)  dx  <  10.  To  make  a  fair  comparison,  the  standard  Newton  method  (L^)  was 
performed  subject  to  the  trust-region  constraint  as  well.  We  show  that  despite  this 
regularization,  the  generalized  Newton  method  {Cs  with  trust-region)  yields  better 
experimental  results. 

The  first  example.  Fig.  4.1,  presents  a  synthetic  shapes  image  with  additive  Gaus¬ 
sian  noise  of  5.36  dB  SNR.  Here  ^  =  2,  v  =  2,  X  =  0.007,  Ai  =  A2  =  0.007,  and  the 
standard  deviation  of  the  smoothing  kernel  was  set  to  cr  =  1.5.  The  level  set 
function  was  initialized  as  an  arbitrary  cone. 

In  the  dancer  example.  Figure  4.2,  we  added  a  Gaussian  noise  with  18.21  dB 
SNR.  Parameters  were  set  to  ^  =  6.5,  v  =  5.5,  A  =  0.5,  Ai  =  A2  =  0.5,  and  ct  =  1. 
Using  Sobolev  gradient  descent  yields  good  but  non-accurate  segmentation  (Figure  4.3 
(a)-(d)).  The  Newton  method  with  trust-region  results  are  similar  to  those  of  the 
generalized  Newton  method,  although  the  latter  one  is  cleaner  (Figure  4.3  (e),(f)). 
As  we  explained  earlier,  the  trust-region  constraint  stabilizes  the  solution  even  though 
the  Hessian  is  not  positive  definite.  In  this  specific  example,  the  generalized  Newton 
method  with  the  smoothing  norm  did  not  significantly  outperformed  the  standard 
Newton  method  with  trust-region.  Nevertheless,  In  the  next  two  examples  we  will  see 
that  the  smoothing  norm  does  make  a  difference  in  the  segmentation  results. 

In  the  third  example  we  segmented  the  letters  of  an  old  newspaper  (Figure  4.4). 
The  image  is  naturally  degraded  by  film-grain  noise  and  the  segmentation  in  this  case 
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(a)  -  Standard  gradient  descent 


(c)  -  Sobolev  gradient  descent 


(b)  -  Standard  Newton 


(d)  Proposed  Newton  with  Cs 


Fig.  4.1.  Segmentation  by  geodesic  active  contours,  (a)  Classical  gradient  descent  method,  (b) 
Newton  method  with  trust-region,  (c)  Gradient  descent  with  the  Sobolev  norm  [19].  (d)  Generalized 
Newton  method  with  a  smoothing  operator  Cs.  The  red  curves  indicate  the  obtained  segmentation. 


is  very  challenging.  The  parameters  were  set  to  /r  =  5,  :/  =  5,  A  =  0.1,  Ai  =  A2  =  0.5, 
and  a  =  0.8.  The  segmentation  results  are  shown  in  Figure  4.5.  Gradient  descent  and 
Newton  methods  both  yield  very  noisy  segmentation  results.  The  segmentation  using 
the  generalized  Newton  method  is  cleaner  (see  the  letters  UL),  and  more  accurate 
than  the  Sobolev  gradient  descent  result.  This  can  be  noticed  in  the  little  subtitle 
(Figure  4.5  (e),(f)). 

In  the  last  example  we  segment  an  ultrasound  image  which  is  known  to  be  a 
very  difficult  test  (leading  techniques  for  segmenting  ultrasound  data  via  geometric 
active  contours  add  shape  priors).  Since  the  image  is  very  noisy,  we  increased  the 
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(a)  -  Standard  gradient  descent 


(b)  -  Standard  Newton 


(c)  -  Sobolev  gradient  descent 


(d)  Proposed  Newton  with  Cs 


Fig.  4.2.  Segmentation  by  geodesic  active  contours,  (a)  Classical  gradient  descent  method,  (b) 
Newton  method  with  trust-region,  (c)  Gradient  descent  with  the  Sobolev  norm  [19].  (d)  Generalized 
Newton  method  with  a  smoothing  operator  Cs.  The  red  curves  indicate  the  obtained  segmentation. 


smoothing  kernel  standard  deviation  to  cr  =  6.  The  rest  of  the  parameters  were  set 
to  ^  =  3,  =  0,  A  =  1,  and  Ai,  A2  =  1.  As  can  be  seen  in  Figure  4.6,  the  effect  of 

the  smoothing  norm  is  very  significant,  even  when  compared  to  the  Sobolev  gradient 
descent  method. 

In  addition  to  the  improved  segmentation  results  by  using  the  generalized  Newton 
method,  the  computational  efficiency  of  the  algorithm  has  to  be  considered  as  well. 
We  present  in  Table  4.1  the  running  time  for  the  tested  methods.  The  program  was 
implemented  with  the  MATLAB  environment  on  a  2Ghz  PC.  Except  for  the  small 
image  of  artificial  shapes  example,  which  is  relatively  an  easy  one,  significant  difference 
in  running  time  can  be  observed  between  Newton-like  methods  and  gradient  descent¬ 
like  methods.  The  generalized  Newton  method  is  a  little  bit  slower  than  the  classical 
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Fig.  4.3.  Zoomed  regions  of  the  geodesic  active  contours  segmentation.  The  top  row  shows 
the  outcome  of  the  Sobolev  gradient  descent  method  (a),(c)  and  the  standard  Newton  method  with 
trust-region  (e).  The  bottom  row  shows  the  outcome  of  the  proposed  generalized  Newton  method 
using  the  Cs  operator. 


Fig.  4.4.  Old  newspaper  naturally  degraded  by  film-grain  noise.  The  red  circle  is  the  initial 
active  contour. 


Newton  one  because  of  the  convolution  operator  in  the  norm  calculations. 


5.  Image  Deblurring.  In  the  next  example  of  our  generalized  Newton  method, 
we  look  at  the  variant  of  the  Mumford-shah  regularizer  for  color  images  deblurring  [2, 
3,12,13,17]: 


^2{r.v) 


g'^Y‘dx  +  13  /  v'^\\V f\\dx  +  ! 


4e  J 
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j  dx, 
(5.1) 


(c)  -  Sobolev  gradient  descent  (d)  Proposed  Newton  with  Cs 


'S’oS’Es  llo  S53gi)«loyo 


(e)  -  Sobolev  gradient  descent  (zoom) 


(f)  Proposed  Newton  with  Cg  (zoom) 


Fig.  4.5.  Segmentation  by  geodesic  active  contours,  (a)  Classical  gradient  descent  method,  (b) 
Newton  method  with  trust-region,  (c)  Gradient  descent  with  the  Sobolev  norm  [19].  (d)  Generalized 
Newton  method  with  a  smoothing  operator  Cs-  (e)  Gradient  descent  with  the  Sobolev  norm  [19]  - 
zoomed  region,  (f)  Generalized  Newton  method  with  a  smoothing  operator  Cs  -  zoomed  region. 


where  a,f3,£  G  M'*',  and  c  G  {R,G,B}.  The  observed  (blurred)  vectorial  image  is 
denoted  by  g,  h  is  the  (known)  blur  kernel,  and  /  is  the  (unknown)  clean  vectorial 
image.  The  auxiliary  scalar  function  v{x)  represents  the  edges  -  it  is  close  to  1  in  the 
smooth  parts  of  the  image  and  close  to  0  near  the  edges.  {g,f,v  are  all  defined  on 
n  C  K^.)  The  magnitude  of  the  vectorial  gradient  is  given  by  the  Frobenius  norm, 


liv/ll 


ce{R,G,B} 
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(c)  -  Sobolev  gradient  descent 


(d)  Proposed  Newton  with  Cs 


Fig.  4.6.  Segmentation  by  geodesic  active  contours,  (a)  Classical  gradient  descent  method,  (b) 
Newton  method  with  trust-region,  (c)  Gradient  descent  with  the  Sobolev  norm  [19].  (d)  Generalized 
Newton  method  with  a  smoothing  operator  Cs.  The  red  curves  indicate  the  obtained  segmentation. 


image 

size 

Newton  with  Cg 

Newton 

gradient  descent 

Sobolev  gradient  descent 

shapes 

187  X  216 

2.05 

1.61 

1.96 

1.66 

dancer 

535  X  341 

10.6 

8.28 

18.79 

25.56 

newspaper 

801  X  480 

24.67 

15.6 

42.8 

92.39 

ultrasound 

315  X  335 

5.47 

4.56 

41.9 

63.04 

Table  4.1 

Running  time  [secs]  of  the  geodesic  active  contours  algorithm  with  different  optimization  meth¬ 
ods. 


The  cost  functional  depends  on  the  variables  /°,  c  G  {R,  G,  B}  and  v  (with  the 
corresponding  variations  and  77),  where  the  optimization  process  is  performed  al¬ 
ternately.  To  make  the  functional  differentiable  with  respect  to  /,  the  norm  ||V/|| 
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is  replaced  by  the  modified  norm  ^||V/P  +  /r,  ^  <C  1.  Thus,  the  first  variation 
with  respect  to  v  is  given  by 


<  I  V  >= 


2/3ti'\/||  V/P  +  fi  +  ^  —  2ea\7‘^v 

2s 


rjdx,  (5.2) 


and  the  first  variation  with  respect  to  /'^  is 


<  V/cJc-2  I  V’"'  >=  f 

Jo, 


{h*  —  g‘^)  *  h{—x)  —  aV 


PVp 


VIIV/P  +  M, 


ip‘^dx.  (5.3) 


The  gradient  at  iteration  n  is  therefore 


^f^^2{rn)  =  ih*f^-  P)  *  h{-x)  -  aV  ' 


VWTn 


After  discretization  by  a  standard  finite  difference  scheme,  the  integrand  of  (5.2) 
can  be  represented  in  matrix  form,  Av  =  B,  where  A  is  sparse.  As  a  result,  the 
optimization  with  respect  to  v  is  effectively  performed  via  the  Generalized  Minimal 
Residual  algorithm  (MATLAB:  gmres).  We  used  the  proposed  generalized  Newton 
method  only  for  the  optimization  of  f‘^.  The  Hessian  is  given  by 


Hf.= 


h{x)  *  h{—x)*  0  0 

0  0  0 
0  0  0 


^  ^  0  0  0  ^ 


0  Rii  Ri2 

0  i?2i  R22 


where  R  :=  Rij  {i,j  =  1,  2)  takes  the  form 


R  = 


^  (I|V/||2+^)3/2 


—fiv 


V 


(I|V/||+M)3/2 


(iv 


(I|V/||2+^)3/2 
(I|V/||2+^«)3/2  J 


(5.4) 


(5.5) 
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Like  in  the  previous  functional,  the  matrix  R  is  indefinite,  and  therefore  the 
Legendre  condition,  and  in  turn  the  necessary  condition  of  positive  definite  Hessian 
are  not  satisfied  here  as  well.  The  proposed  algorithm  is  therefore  valuable  by  the 
stabilizing  property  of  the  trust-region  constraint.  In  addition,  the  deblurring  process 
is  known  to  be  an  ill-posed  inverse  problem,  and  as  the  numerical  simulations  show, 
standard  Newton  methods  result  in  poor  restoration  results.  Significant  improvement 
is  accomplished  using  a  variant  of  the  Sobolev  norm 

<u\v>H'=^i  /  P{x)  [u{x)  ■  v{x)]dx  +  X2  /  Vu{x)  ■  Vv{x)dx,  P{x)>0, 

Jq  Jn 

where  P{x)  :  H  — >  K  and  Ai,  A2  €  K“''.  This  norm  leads  to  the  Hamiltonian  operator 
Ch  =  XiP{x)  -  A2V2. 

Theorem  2.  The  Hamiltonian  operator  Ch  =  XiP(x)  —  A2V^  with  P(x)  >  0  is 
self-adjoint  and  positive  definite. 

Proof.  See  appendix  B.  □ 

This  is  the  operator  we  now  use,  instead  of  the  classical  for  the  proposed 
generalized  Newton  method  for  addressing  the  variational  debluring  problem.  In  the 
first  experiment.  Figure  5.1,  the  blurred  220  x  250  dog  image  is  degraded  by  an  out- 
of-focus  kernel.  Further  amount  of  synthetic  blur  with  a  pill-box  kernel  of  radius 
2.4  was  added  in  order  to  increase  the  blur  effect  for  ease  of  visualization  and  to 
make  the  problem  even  more  challenging.  Debluring  was  performed  with  3  different 
methods.  The  parameters  of  (5.1)  were  set  to  (3  =  0.01,  a  =  10“®,  and  e  =  10“^. 
The  recovered  image  using  the  classical  Newton  method,  with  added  trust-region,  is 
shown  in  (c).  Poor  restoration  was  obtained  in  this  case.  The  proposed  generalized 
Newton  method  with  the  smoothing  norm  Cg  is  shown  in  (d).  As  the  smoothing 
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(c)  -  Standard  Newton  (d)  -  Newton  with  Cg 

Fig.  5.1.  Deblwrring  of  the  dog  image  with  different  Newton-like  methods,  (a)  Blurred  image, 
(h)  Recovered  image  using  trust-region  Newton  with  Hamiltonian  norm,  (c)  Recovered  image  us¬ 
ing  standard  trust-region  Newton  method,  (d)  Recovered  image  using  trust-region  Newton  with  a 
smoothing  norm. 


kernel  is  increased,  the  more  blurred  is  the  recovered  image.  This  can  be  explained 
by  the  fact  that  smoothing  the  incremental  image  ipn  prevents  the  desired  sharpening 
operation.  Better  results,  (b),  are  achieved  using  the  proposed  Hamiltonian  operator 


(5.6) 


where  Ai  =  1,  A2  =  24,  and  Vn(x)  is  the  edge  indicator  function  calculated  at  iteration 
n.  The  idea  behind  the  selection  of  this  operator  is  that,  unlike  the  uniform  norm 
used  in  the  classical  Newton  method,  here  we  restrict  the  inner  product  to  the  image 
edges.  The  contribution  of  the  first  term  of  (5.6)  is  due  to  the  presence  of  edges. 


28 


Fig.  5.2.  Deblurring  of  the  clown  image  with  different  Newton-like  methods,  (a)  Blurred  image, 
(b)  Recovered  image  using  trust-region  Newton  with  Hamiltonian  adaptive  norm,  (c)  Recovered 
image  using  standard  trust-region  Newton  method,  (d)  Recovered  image  using  trust-region  Newton 
with  a  smoothing  norm. 


while  the  Laplacian  operator  amplifies  high  gradients.  In  addition,  the  operator  is 
adaptively  updated  at  each  iteration  as  the  edge  function  v(x)  gets  more  accurate. 
This  shows  the  additional  flexibility  of  our  proposed  method,  the  inner  product  can 
be  adapted  to  the  problem  at  hand. 

In  the  second  example,  Figure  5.2,  the  330  x  291  clown  image  was  additionally 
blurred  by  an  out-of- focus  kernel  of  radius  1.5.  Like  in  the  previous  example,  the 
generalized  Newton  method  with  the  Hamiltonian  operator  yields  the  best  results 
(note  for  example  the  nose,  eyes,  and  hair).  The  parameters  set  were  selected  as  in 
the  dog  example. 
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6.  Discussion.  In  this  work  we  have  extended  the  classical  Newton  method  by 
using  different  inner  products  in  the  variational  framework,  extending  the  gradient- 
descent  work  in  [8,  19]  to  the  more  efficient  optimization  framework  provided  by 
Newton-type  methods.  The  experimental  results  show  the  advantage  of  the  method 
in  computational  efficiency  and  noisy  data  performance.  The  selection  of  the  most  ap¬ 
propriate  inner  product  associated  to  a  particular  functional  is  still  an  interesting  open 
problem  for  future  research.  One  might  pre-analyze  the  cost  functional  and/or  the 
given  data,  and  design  an  inner  product  which  would  yield  optimal  results.  Another 
research  direction  may  incorporate  non-ffat  manifolds  instead  of  Euclidean  spaces 
while  benefiting  from  the  efficiency  and  flexibility  of  the  generalized  Newton  method. 

Appendix  A. 

Theorem  1.  The  operator  Cg  defined  as  the  convolution  with  a  Gaussian  of 
width  (7,  CgU  =  hu  *  u,  is  self-adjoint  and  positive  definite. 

Proof.  Let  H  he  a,  convolution  operator,  i.e.,  Hu{x)  =  h{x)  *  u{x),  where  x  € 

Its  adjoint  operator  H*  is  defined  by  <  v,Hu  >=<  H*v,u  >.  Here, 

<  H*v,u  >=<  V,  Hu  >  =  /  v-{h*u)dx=  /  [h{—x)*v{x)]-u{x)dx. 

7r2 

Hence,  H*v{x)  =  h{—x)  *  v(x).  In  the  case  of  a  Gaussian  kernel  h{x)  =  ha{x)  = 
h„{—x)  =  h{—x),  and  therefore  the  operator  is  self  adjoint. 

Let  m(^)  and  h{f)  be  the  Fourier  transforms  of  u{x)  and  h^{x)  respectively,  and 
let  h*{^)  be  their  complex  conjugates.  For  a  real  function  u  :  ^  M,  u{—^)  = 


<  U,CsU  >  = 


i{x)[h^{x)  *  u{x)]dx  =  u{f)  * 
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(A.l) 


Substituting  the  convolution  operator  yields 


<  U,CsU  >  = 


=  I  ui-aHaHOde 


JC=o 


=  /  /  KnrMm'>o 

JK2  JR2 


for  all  Gaussian  kernels  and  functions  u  that  are  not  identically  zero,  which  proves 
that  £s  is  positive  definite.  □ 

Appendix  B. 

Theorem  2.  The  Hamiltonian  operator  Cr  =  AiP(a;)  —  A2V^  with  P{x)  >  0  is 
self-adjoint  and  positive  definite. 

Proof.  We  first  show  that  the  operator  is  self-adjoint.  Using  integration  by  parts 
and  Neumann  boundary  conditions, 


<v,£hu>=\i  /  vP{x)udx  —  X2  /  vV^udx 

Jq  Jo. 

=Ai  /  uP{x)vdx X2  /  X/v-X/udx 

Jq  Jo. 

=Ai  /  uP{x)vdx  —  X2  /  u\7^vdx  =<  £hv,u  >, 
Jq  Jq 


and  the  operator  is  self-adjoint. 

We  proceed  to  show  that  £h  is  positive  definite: 


<  u,£hu  >= 


Ai  /  vJP{x)dx  —  X2  /  uV^udx  =  Xi  /  vJP{x)dx X2  /  >  0. 

JCI  Jfl  Jfl  Jft 


□ 
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